Horton Tables: Fast Hash Tables for In-Memory Data-Intensive Computing

نویسندگان

Alex D. Breslow

Dong Ping Zhang

Joseph L. Greathouse

Nuwan Jayasena

Dean M. Tullsen

چکیده

Hash tables are important data structures that lie at the heart of important applications such as key-value stores and relational databases. Typically bucketized cuckoo hash tables (BCHTs) are used because they provide highthroughput lookups and load factors that exceed 95%. Unfortunately, this performance comes at the cost of reduced memory access efficiency. Positive lookups (key is in the table) and negative lookups (where it is not) on average access 1.5 and 2.0 buckets, respectively, which results in 50 to 100% more table-containing cache lines to be accessed than should be minimally necessary. To reduce these surplus accesses, this paper presents the Horton table, a revamped BCHT that reduces the expected cost of positive and negative lookups to fewer than 1.18 and 1.06 buckets, respectively, while still achieving load factors of 95%. The key innovation is remap entries, small in-bucket records that allow (1) more elements to be hashed using a single, primary hash function, (2) items that overflow buckets to be tracked and rehashed with one of many alternate functions while maintaining a worst-case lookup cost of 2 buckets, and (3) shortening the vast majority of negative searches to 1 bucket access. With these advancements, Horton tables outperform BCHTs by 17% to 89%.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cheap and Large CAMs for High Performance Data-Intensive Networked Systems

We show how to build cheap and large CAMs, or CLAMs, using a combination of DRAM and flash memory. These are targeted at emerging data-intensive networked systems that require massive hash tables running into a hundred GB or more, with items being inserted, updated and looked up at a rapid rate. For such systems, using DRAM to maintain hash tables is quite expensive, while on-disk approaches ar...

متن کامل

Stack and Queue Integrity on Hostile Platforms

When computationally intensive tasks have to be carried out on trusted, but limited, platforms such as smart cards, it becomes necessary to compensate for the limited resources (memory, CPU speed) by o loading implementations of data structures on to an available (but insecure, untrusted) fast co-processor. However, data structures such as stacks, queues, RAMS, and hash tables can be corrupted ...

متن کامل

Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming

We present algorithms for shrinking and expanding a hash table while allowing concurrent, wait-free, linearly scalable lookups. These resize algorithms allow ReadCopy Update (RCU) hash tables to maintain constanttime performance as the number of entries grows, and reclaim memory as the number of entries decreases, without delaying or disrupting readers. We call the resulting data structure a re...

متن کامل

Cuckoo++ Hash Tables: High-Performance Hash Tables for Networking Applications

Hash tables are an essential data-structure for numerous networking applications (e.g., connection tracking, rewalls, network address translators). Among these, cuckoo hash tables provide excellent performance by allowing lookups to be processed with very few memory accesses (2 to 3 per lookup). Yet, for large tables, cuckoo hash tables remain memory bound and each memory access impacts perform...

متن کامل

Efficient implementation of error correction codes in hash tables

Hash tables are one of the most commonly used data structures in computing applications. They are used for example to organize a data set such that searches can be performed efficiently. The data stored in a hash table is commonly stored in memory and can suffer errors. To ensure that data stored in a memory is not corrupted when it suffers errors, Error Correction Codes (ECCs) are commonly use...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Horton Tables: Fast Hash Tables for In-Memory Data-Intensive Computing

نویسندگان

چکیده

منابع مشابه

Cheap and Large CAMs for High Performance Data-Intensive Networked Systems

Stack and Queue Integrity on Hostile Platforms

Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming

Cuckoo++ Hash Tables: High-Performance Hash Tables for Networking Applications

Efficient implementation of error correction codes in hash tables

عنوان ژورنال:

اشتراک گذاری